Significance of Joint Features Derived from the Modified Group Delay Function in Speech Processing

Authors

  • Rajesh M. Hegde
  • Hema A. Murthy
  • Venkata Ramana Rao Gadde
Abstract

This paper investigates the significance of combining cepstral features derived from the modified group delay function with features derived from the short-time spectral magnitude, such as the MFCC. The conventional group delay function fails to capture the resonant structure and the dynamic range of the speech spectrum, primarily because pitch periodicity effects introduce spikes. The group delay function is therefore modified to suppress these spikes and to restore the dynamic range of the speech spectrum. The cepstral features derived from the modified group delay function are called the modified group delay feature (MODGDF). The complementarity and robustness of the MODGDF relative to the MFCC are also analyzed using spectral reconstruction techniques. The combination of several spectral magnitude-based features with the MODGDF, using feature fusion and likelihood combination, is described. These features are then used for three speech processing tasks, namely syllable, speaker, and language recognition. Results indicate that combining the MODGDF with the MFCC at the feature level gives significant improvements for speech recognition tasks in noise. Combining the MODGDF with the spectral magnitude-based features increases recognition performance by up to 11%, while combining any two features derived from the spectral magnitude gives no significant improvement.
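The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the code assumes the commonly cited form of the modified group delay function, in which the denominator |X|^2 is replaced by a cepstrally smoothed spectrum raised to 2*gamma and the result is compressed with an exponent alpha. The function names, the parameter values (`alpha=0.4`, `gamma=0.9`, `lifter=8`, `n_ceps=13`), and the equal-weight likelihood combination are illustrative assumptions.

```python
import numpy as np

def _spectra(frame, n_fft):
    """FFT of x[n] and of n*x[n]; both are needed for the group delay."""
    n = np.arange(len(frame))
    return np.fft.rfft(frame, n_fft), np.fft.rfft(n * frame, n_fft)

def group_delay(frame, n_fft=512):
    """Conventional group delay: (X_R*Y_R + X_I*Y_I) / |X|^2."""
    X, Y = _spectra(frame, n_fft)
    num = X.real * Y.real + X.imag * Y.imag
    return num / np.maximum(np.abs(X) ** 2, 1e-12)

def smoothed_magnitude(frame, n_fft=512, lifter=8):
    """Cepstrally smoothed magnitude spectrum (assumed smoothing method)."""
    log_mag = np.log(np.maximum(np.abs(np.fft.fft(frame, n_fft)), 1e-12))
    cep = np.fft.ifft(log_mag).real
    cep[lifter + 1 : n_fft - lifter] = 0.0  # keep only low quefrencies
    return np.exp(np.fft.fft(cep).real)[: n_fft // 2 + 1]

def modified_group_delay(frame, n_fft=512, alpha=0.4, gamma=0.9, lifter=8):
    """Assumed form: smoothed denominator plus sign-preserving compression."""
    X, Y = _spectra(frame, n_fft)
    S = smoothed_magnitude(frame, n_fft, lifter)
    tau = (X.real * Y.real + X.imag * Y.imag) / S ** (2 * gamma)
    return np.sign(tau) * np.abs(tau) ** alpha

def modgd_cepstra(frame, n_ceps=13, **kwargs):
    """Cepstral features: DCT-II of the modified group delay spectrum."""
    tau = modified_group_delay(frame, **kwargs)
    K = len(tau)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * np.arange(K) + 1) / (2 * K))
    return basis @ tau

def fuse_features(feat_a, feat_b):
    """Feature-level fusion: concatenate two per-frame feature vectors."""
    return np.concatenate([feat_a, feat_b], axis=-1)

def combine_log_likelihoods(ll_a, ll_b, weight=0.5):
    """Likelihood combination: weighted sum of per-class log-likelihoods."""
    return weight * np.asarray(ll_a) + (1.0 - weight) * np.asarray(ll_b)
```

A quick sanity check on the conventional function: a unit impulse delayed by d samples has a group delay of exactly d at every frequency, so `group_delay` applied to such a frame should return a constant vector equal to d. In the fusion helper, the second argument would be an MFCC vector computed by any standard front end; none is included here.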


Similar resources

Group delay functions and its applications in speech technology

Traditionally, the information in speech signals is represented in terms of features derived from short-time Fourier analysis. In this analysis the features extracted from the magnitude of the Fourier transform (FT) are considered, ignoring the phase component. Although the significance of the FT phase was highlighted in several studies over the recent three decades, the features of the FT phas...


Continuous speech recognition using joint features derived from the modified group delay function and MFCC

Feature extraction and selection for continuous speech recognition is a complex task. State of the art speech recognition systems use features that are derived by ignoring the Fourier transform phase. In our earlier studies we have shown the efficacy of The Modified Group Delay Feature (MODGDF) derived from the Fourier transform phase for phoneme, syllable and speaker recognition. In this paper...


The modified group delay feature: a new spectral representation of speech

Automatic recognition of speech by machines begins with extraction of meaningful features from the speech signal. Conventional features like the MFCC are derived from the Fourier transform magnitude spectrum, while totally ignoring the phase spectrum. The importance of the Modified group delay feature (MODGDF) derived from the Fourier transform phase spectrum for speaker and phoneme recognition...


Using group delay functions from all-pole models for speaker recognition

Popular features for speech processing, such as mel-frequency cepstral coefficients (MFCCs), are derived from the short-term magnitude spectrum, whereas the phase spectrum remains unused. While the common argument to use only the magnitude spectrum is that the human ear is phase-deaf, phase-based features have remained less explored due to additional signal processing difficulties they introduc...


Dimensionality reduction methods applied to both magnitude and phase derived features

A number of previous studies have shown that speech sounds may have an intrinsic low dimensional structure. Such studies have focused on magnitude-based features ignoring phase information, as is the convention in many speech processing applications. In this paper dimensionality reduction methods are applied to MFCC and modified group delay function (MODGDF) features derived from the magnitude ...



Journal:
  • EURASIP J. Audio, Speech and Music Processing

Volume 2007

Publication date: 2007